-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using static for LUTs #354
base: master
Are you sure you want to change the base?
Conversation
This seems to improve timing as well (unless I'm behind on the latest numbers)
|
@@ -7,7 +7,7 @@ | |||
|
|||
template<int kNBitsBuffer> | |||
static const ap_uint<(1 << (2 * kNBitsBuffer))> nearFullUnit() { | |||
ap_uint<(1 << (2 * kNBitsBuffer))> lut; | |||
static ap_uint<(1 << (2 * kNBitsBuffer))> lut; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryates For this to work, I think you also need logic to prevent the reinitialization of the LUT. For example:
template<int kNBitsBuffer>
static const ap_uint<(1 << (2 * kNBitsBuffer))> nearFullUnit() {
static ap_uint<(1 << (2 * kNBitsBuffer))> lut;
static bool initialized = false;
if (!initialized) {
initialized = true;
for(int i = 0; i < (1 << (2 * kNBitsBuffer)); ++i) {
#pragma HLS unroll
ap_uint<kNBitsBuffer> wptr, rptr;
ap_uint<2 * kNBitsBuffer> address(i);
(rptr,wptr) = address;
auto wptr1 = wptr+1;
auto wptr2 = wptr+2;
bool result = wptr1==rptr || wptr2==rptr;
lut[i] = result;
}
}
return lut;
}
Of course, this applies to the other functions in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryates You might also remove the static const
on the return types of these functions, since they don't really have any effect.
@bryates After addressing the comment above, could you rerun the out-of-context implementation and also do a comparison of the resource utilization, before your changes vs. after? Just to make sure nothing unexpected is happening during C-synthesis. |
These changes reduced the resource utilization a bit. The post-implementation timing is about the same.
|
This is an alternative to #351 to address an issue where CMSSW wastes CPU cycles for emptyUnit and nearFull3Unit. All luts are now declared as static, so this should fix the issue.